-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce differentiation interface #2137
Conversation
@gdalle it would be nice if you could have a look at 64091bc. This already works very nicely! We might be able to get rid of our direct |
When testing dense Jacobian + ForwardDiff I get an error like this:
I wonder whether that has something to do with passing the AD backend to both |
Taking a look now |
core/src/model.jl
Outdated
@@ -28,6 +28,33 @@ struct Model{T} | |||
end | |||
end | |||
|
|||
function get_jac_eval(du::Vector, u::Vector, p::Parameters, solver::Solver) | |||
backend = if solver.autodiff | |||
AutoForwardDiff() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you want to configure the tag here if it is available from somewhere else (perhaps the solver
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
solver
is our own solver config object, but it works if I consistently specify the same tag everywhere 👍
core/src/util.jl
Outdated
p.all_nodes_active = false | ||
jac_prototype | ||
end | ||
|
||
# Custom overloads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to get rid of the SCT dependency, you may want to
- put those overloads in an SCT package extension
- ask the user to provide an AD backend, and if it is an
AutoSparse
, retrieve itssparsity_detector
instead of providing your own
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm this probably doesn't fit our application as described in #2137 (comment). Keeping the SCT dependency is fine
# Custom overloads | ||
reduction_factor(x::GradientTracer, threshold::Real) = x | ||
reduction_factor(x::GradientTracer, ::Real) = x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is explicitly discouraged in the SCT docs: see the following page to add overloads properly
https://adrianhill.de/SparseConnectivityTracer.jl/stable/internals/adding_overloads/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I know 🙃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of these functions have more than 2 arguments, but for all of them we only care about the derivative with respect to one input. It's not clear to me whether/how that fits within the overload functionality
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to use our mechanisms. They will generate a couple of superfluous methods, but I don't see the harm in that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be fair, this specific line of code looks harmless. But once you have more than a handful of overloads, you might want to create an SCT extension and follow our docs. The generated code will be more future proof and compatible with local and global Jacobian and Hessian tracers. Your current code only supports global Jacobians.
core/src/model.jl
Outdated
|
||
# Activate all nodes to catch all possible state dependencies | ||
p.all_nodes_active = true | ||
prep = prepare_jacobian((du, u) -> water_balance!(du, u, p, t), du, backend, u) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prep = prepare_jacobian((du, u) -> water_balance!(du, u, p, t), du, backend, u) | |
prep = prepare_jacobian(water_balance!, du, backend, u, Constant(p), Constant(t)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the docs:
Another option would be creating a closure, but that is sometimes undesirable.
When is a closure undesirable?
gradient(f, backend, x, Constant(c))
gradient(f, backend, x, Cache(c))
In the first call, c is kept unchanged throughout the function evaluation. In the second call, c can be mutated with values computed during the function.
Our p
contains caches for in place computations in our RHS (hence the discussion on PreallocationTools
etc. in the related issue). does that mean that we should use Cache(p)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should SciMLStructures.jl
come in here for more granular control?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is a closure undesirable?
With Enzyme in particular it can make things slower or even error. With other backends it doesn't make much of a difference, but explicitly laying out the Context
s also allows taking into account element types (eg. for handling translation to Dual
with Cache
s).
Our p contains caches for in place computations in our RHS (hence the discussion on PreallocationTools etc. in the related issue). does that mean that we should use Cache(p)?
Does p
contain anything whose value you care about?
In general, you might want to split it between a Constant
part and a Cache
part.
Should SciMLStructures.jl come in here for more granular control?
DI has no specific support for SciMLStructures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gdalle I'm working on a refactor where e.g. the prep
definition now looks like this:
prep = prepare_jacobian(
water_balance!,
du,
backend,
u,
Constant(p_non_diff),
Cache(p_diff),
Constant(p_mutable),
Constant(t),
)
This now fails in the sparsity detection because of an attempt to write GradientTracer
values to a Vector{Float64}
field of p_diff::ParametersDiff
. I made ParametersDiff
parametric so ParametersDiff{GradientTracer{...}}
can exist, and I half expected this to be constructed internally. This probably worked before because of PreallocationTools
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I just saw this warning in the docs:
Most backends require any Cache context to be an AbstractArray.
Let's see what I can do with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like that quickly solves the problem. I just naively subtyped ParametersDiff{T} <: AbstractVector{T}
. Maybe I need to overload some methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that's a tricky one and it's indeed on me. If you don't want to wait for a DI fix (ETA ~ days), a short-term solution would be to use PreallocationTools and a closure, even if it makes Enzyme angry.
core/src/model.jl
Outdated
jac = | ||
(J, u, p, t) -> | ||
jacobian!((du, u) -> water_balance!(du, u, p, t), du, J, prep, backend, u) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jac = | |
(J, u, p, t) -> | |
jacobian!((du, u) -> water_balance!(du, u, p, t), du, J, prep, backend, u) | |
jac(J, u, p, t) = jacobian!(water_balance!, du, J, prep, backend, u, Constant(p), Constant(t)) |
Love the confident commit naming. That's the spirit we wanna see |
core/src/model.jl
Outdated
diff_cache_SCT = | ||
zeros(GradientTracer{IndexSetGradientPattern{Int64, BitSet}}, length(diff_cache)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this PR you can use SCT.jacobian_buffer
instead, and with the update to DI I'll make once that is merged, you probably won't need any tweak at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
@SouthEndMusic can you take the branch from JuliaDiff/DifferentiationInterface.jl#739 for a spin, see if it works? Warning I am aware that using |
@gdalle I took the main branch, and it indeed works 👍 |
@SouthEndMusic with the branch from JuliaDiff/DifferentiationInterface.jl#741 the |
The changes have been released |
Some runtimes of the
|
If the Jacobian is sparse, can you try other orders inside the |
What can be the effect of this? A different number of rhs calls required to compute the Jacobian? |
Yes, it influences the number of different colors with which the columns of the Jacobian are colored, and one color equals one function call (not exactly true with ForwardDiff though). This could accelerate the FiniteDiff version significantly if the natural coloring was suboptimal. |
You can call |
|
Heads up, DI v0.6.46 supports nested tuples and named tuples of arrays as caches, at least for the most common backends |
This is really nice, quite a bit of complexity can be removed now 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great stuff @SouthEndMusic. And thanks for all the help @gdalle.
I left some minor comments, but let's get this in soon, we can always refine later.
core/src/validation.jl
Outdated
@assert flow_rate_update.name == :flow_rate | ||
flow_rate_ = minimum(flow_rate_update.value.u) | ||
|
||
if flow_rate_ < 0.0 | ||
errors = true | ||
control_state = key[2] | ||
@error "$id_controlled flow rates must be non-negative, found $flow_rate_ for control state '$control_state'." | ||
@error "Negative flow rate(s) for $id_controlled, control state '$control_state' found." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@error "Negative flow rate(s) for $id_controlled, control state '$control_state' found." | |
@error "Negative flow rate(s) found." node_id=id_controlled control_state |
This adds a function `model.to_fews(region_dir)` that converts the network and results to files that Delft-FEWS can directly handle. It is marked as experimental for now. @gijsber is working on a Delft-FEWS configuration that can be used to visualize model results, to complement our existing tools. We'll likely add this configuration to this monorepo since it is generic. #2159 also pertains to this work. What is especially nice is the spatio-temporal support of Delft-FEWS, so we can make visualizations like this:  In theory we can support similar functionality with QGIS, but looking at the plots in #1369 this would likely need work in QGIS itself. So this is really a quick win to be able to inspect models better. --------- Co-authored-by: Maarten Pronk <git@evetion.nl>
Fixes #2134